Dataset statistics
| Number of variables | 12 |
|---|---|
| Number of observations | 25074 |
| Missing cells | 153701 |
| Missing cells (%) | 51.1% |
| Duplicate rows | 38 |
| Duplicate rows (%) | 0.2% |
| Total size in memory | 2.3 MiB |
| Average record size in memory | 96.0 B |
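The two percentages in this table follow from the counts. The sketch below recomputes them in plain Python as a sanity check; the variable names are illustrative and not part of the report.

```python
# Counts copied from the "Dataset statistics" table.
n_variables = 12
n_observations = 25074
missing_cells = 153701
duplicate_rows = 38

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

missing_pct = 100 * missing_cells / total_cells
duplicate_pct = 100 * duplicate_rows / n_observations

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Duplicate rows: {duplicate_pct:.1f}%")
```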
Variable types
| Text | 11 |
|---|---|
| Categorical | 1 |
Alerts
| Dataset has 38 (0.2%) duplicate rows | Duplicates |
|---|---|
| surname has 5915 (23.6%) missing values | Missing |
| occupation has 8896 (35.5%) missing values | Missing |
| age has 8639 (34.5%) missing values | Missing |
| civil_status has 14370 (57.3%) missing values | Missing |
| nationality has 11760 (46.9%) missing values | Missing |
| surname_household has 19434 (77.5%) missing values | Missing |
| link has 4339 (17.3%) missing values | Missing |
| birth_date has 17730 (70.7%) missing values | Missing |
| lob has 15839 (63.2%) missing values | Missing |
| employer has 22163 (88.4%) missing values | Missing |
| observation has 24472 (97.6%) missing values | Missing |
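The per-variable missing counts can be cross-checked against the dataset-level total of 153701 missing cells. The sketch below is one way to do that; the dictionary is typed in by hand from this report. firstname raises no Missing alert, so its count of 144 is taken from the firstname section instead.

```python
# Missing-value counts per variable, copied from this report.
missing_by_variable = {
    "surname": 5915,
    "firstname": 144,  # from the firstname section, not from an alert
    "occupation": 8896,
    "age": 8639,
    "civil_status": 14370,
    "nationality": 11760,
    "surname_household": 19434,
    "link": 4339,
    "birth_date": 17730,
    "lob": 15839,
    "employer": 22163,
    "observation": 24472,
}

n_observations = 25074
total_missing = sum(missing_by_variable.values())

# Share of missing values per variable, in percent.
missing_pct = {
    name: round(100 * count / n_observations, 1)
    for name, count in missing_by_variable.items()
}

print(total_missing)
print(missing_pct["surname"])
```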
Reproduction
| Analysis started | 2024-04-12 20:29:44.698208 |
|---|---|
| Analysis finished | 2024-04-12 20:29:46.714902 |
| Duration | 2.02 seconds |
| Software version | ydata-profiling v4.7.0 |
| Download configuration | config.json |
surname
Text
MISSING 
| Distinct | 8120 |
|---|---|
| Distinct (%) | 42.4% |
| Missing | 5915 |
| Missing (%) | 23.6% |
| Memory size | 196.0 KiB |
Length
| Max length | 46 |
|---|---|
| Median length | 28 |
| Mean length | 7.1669189 |
| Min length | 2 |
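The reported mean length appears consistent with dividing the variable's total character count (137311) by its number of non-missing values. The sketch below checks that arithmetic; this is an observation about the numbers in this report, not a statement of how ydata-profiling computes the figure.

```python
# Figures copied from this report for the surname variable.
n_observations = 25074
missing = 5915
total_characters = 137311

# Only non-missing values contribute characters.
non_missing = n_observations - missing
mean_length = total_characters / non_missing

print(non_missing)
print(round(mean_length, 4))
```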
Characters and Unicode
| Total characters | 137311 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 5034 |
|---|---|
| Unique (%) | 26.3% |
Sample
| 1st row | Breton |
|---|---|
| 2nd row | Vignat |
| 3rd row | Houy |
| 4th row | Violet |
| 5th row | Apelmeau |
| Value | Count | Frequency (%) |
| idem | 685 | 3.3% |
| le | 225 | 1.1% |
| femme | 147 | 0.7% |
| fe | 146 | 0.7% |
| martin | 101 | 0.5% |
| de | 68 | 0.3% |
| roux | 57 | 0.3% |
| faure | 52 | 0.3% |
| née | 47 | 0.2% |
| fme | 46 | 0.2% |
| Other values (7847) | 19124 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 16174 | 11.8% |
| a | 12396 | 9.0% |
| r | 11917 | 8.7% |
| u | 9218 | 6.7% |
| i | 9124 | 6.6% |
| o | 8730 | 6.4% |
| n | 8426 | 6.1% |
| l | 6263 | 4.6% |
| t | 6222 | 4.5% |
| d | 4292 | 3.1% |
| Other values (63) | 44549 |
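A character table like the one above can be approximated with a case-sensitive count over all non-missing values. The sketch below uses `collections.Counter` on the five sample surnames; the function name is hypothetical and the approach is an assumption, not the profiler's documented implementation.

```python
from collections import Counter


def character_frequencies(values):
    """Count every character across the non-missing strings in `values`."""
    counts = Counter()
    for value in values:
        if value is not None:  # skip missing entries
            counts.update(value)  # a string updates the counter per character
    return counts


# The five sample rows shown for the surname variable.
sample = ["Breton", "Vignat", "Houy", "Violet", "Apelmeau"]
counts = character_frequencies(sample)
total = sum(counts.values())

for char, count in counts.most_common(3):
    print(f"{char!r}: {count} ({100 * count / total:.1f}%)")
```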
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 137311 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 137311 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 137311 |
firstname
Text
| Distinct | 2456 |
|---|---|
| Distinct (%) | 9.9% |
| Missing | 144 |
| Missing (%) | 0.6% |
| Memory size | 196.0 KiB |
Length
| Max length | 30 |
|---|---|
| Median length | 27 |
| Mean length | 6.994986 |
| Min length | 1 |
Characters and Unicode
| Total characters | 174385 |
|---|---|
| Distinct characters | 68 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1697 |
|---|---|
| Unique (%) | 6.8% |
Sample
| 1st row | Cyrille |
|---|---|
| 2nd row | Zélie |
| 3rd row | Caroline |
| 4th row | Esther |
| 5th row | Thérèse |
| Value | Count | Frequency (%) |
| marie | 3721 | 13.7% |
| jean | 1792 | 6.6% |
| pierre | 1046 | 3.8% |
| jeanne | 851 | 3.1% |
| louis | 803 | 2.9% |
| françois | 636 | 2.3% |
| louise | 614 | 2.3% |
| anne | 583 | 2.1% |
| joseph | 466 | 1.7% |
| antoine | 451 | 1.7% |
| Other values (1480) | 16269 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 28927 | 16.6% |
| i | 17828 | 10.2% |
| n | 15939 | 9.1% |
| a | 14436 | 8.3% |
| r | 13537 | 7.8% |
| o | 7232 | 4.1% |
| s | 7158 | 4.1% |
| t | 6600 | 3.8% |
| l | 6593 | 3.8% |
| u | 6079 | 3.5% |
| Other values (58) | 50056 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 174385 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 174385 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 174385 |
occupation
Text
MISSING 
| Distinct | 2056 |
|---|---|
| Distinct (%) | 12.7% |
| Missing | 8896 |
| Missing (%) | 35.5% |
| Memory size | 196.0 KiB |
Length
| Max length | 48 |
|---|---|
| Median length | 43 |
| Mean length | 7.9103103 |
| Min length | 1 |
Characters and Unicode
| Total characters | 127973 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 1386 |
|---|---|
| Unique (%) | 8.6% |
Sample
| 1st row | menuisier |
|---|---|
| 2nd row | prop re |
| 3rd row | domestique |
| 4th row | fe de chambre |
| 5th row | domestique |
| Value | Count | Frequency (%) |
| idem | 3702 | 18.2% |
| cultivateur | 1202 | 5.9% |
| néant | 922 | 4.5% |
| s.p | 656 | 3.2% |
| sans | 615 | 3.0% |
| cult | 549 | 2.7% |
| domestique | 479 | 2.3% |
| de | 470 | 2.3% |
| journalier | 450 | 2.2% |
| sp | 447 | 2.2% |
| Other values (1341) | 10894 |
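"idem" is the most frequent token in this column, which suggests ditto entries that refer back to an earlier row. The sketch below shows one possible way to resolve them by carrying the last explicit value forward. It assumes that "idem" always refers to the nearest preceding non-missing, non-"idem" value, which this report does not confirm; the function name and example list are hypothetical.

```python
def resolve_idem(values, marker="idem"):
    """Replace ditto markers with the last explicit value seen before them.

    Missing entries (None) are left as they are and do not reset the
    carried value. A marker with no earlier explicit value becomes None.
    """
    resolved = []
    last_explicit = None
    for value in values:
        if value is not None and value.strip().lower() == marker:
            resolved.append(last_explicit)
        else:
            resolved.append(value)
            if value is not None:
                last_explicit = value
    return resolved


# Hypothetical column fragment built from values that occur in this report.
example = ["cultivateur", "idem", None, "domestique", "Idem"]
print(resolve_idem(example))
```

Whether a missing row should interrupt the carry-forward depends on how the source registers were transcribed, so that choice is worth checking against the original pages.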
Most occurring characters
| Value | Count | Frequency (%) |
| e | 16422 | 12.8% |
| i | 12728 | 9.9% |
| r | 12157 | 9.5% |
| t | 8179 | 6.4% |
| u | 7687 | 6.0% |
| a | 7608 | 5.9% |
| n | 7592 | 5.9% |
| m | 6403 | 5.0% |
| s | 5860 | 4.6% |
| d | 5714 | 4.5% |
| Other values (63) | 37623 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 127973 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 127973 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 127973 |
age
Text
MISSING 
| Distinct | 253 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 8639 |
| Missing (%) | 34.5% |
| Memory size | 196.0 KiB |
Length
| Max length | 14 |
|---|---|
| Median length | 2 |
| Mean length | 1.9643444 |
| Min length | 1 |
Characters and Unicode
| Total characters | 32284 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 90 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | 25 |
|---|---|
| 2nd row | 30 |
| 3rd row | 24 |
| 4th row | 24 |
| 5th row | 49 |
| Value | Count | Frequency (%) |
| 2 | 367 | 2.2% |
| 6 | 358 | 2.1% |
| 8 | 354 | 2.1% |
| 18 | 343 | 2.0% |
| 4 | 341 | 2.0% |
| 5 | 340 | 2.0% |
| 7 | 335 | 2.0% |
| 3 | 335 | 2.0% |
| 9 | 326 | 1.9% |
| 30 | 326 | 1.9% |
| Other values (128) | 13513 |
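age is profiled as Text, and its maximum length of 14 indicates that some entries are not plain numbers. The sketch below shows a conservative conversion that accepts only values made of one to three ASCII digits and returns `None` for everything else, so unusual entries can be reviewed by hand. The function name and the non-numeric example strings are hypothetical.

```python
import re

# One to three ASCII digits, allowing surrounding whitespace.
_PLAIN_AGE = re.compile(r"\s*([0-9]{1,3})\s*")


def parse_age(raw):
    """Return the age as an int, or None if the entry is missing or not a plain number."""
    if raw is None:
        return None
    match = _PLAIN_AGE.fullmatch(raw)
    return int(match.group(1)) if match else None


# "25" and "8" occur in this report; the last two strings are made-up examples.
for raw in ["25", " 8 ", None, "idem", "3 mois"]:
    print(repr(raw), "->", parse_age(raw))
```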
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 4496 | 13.9% |
| 2 | 4255 | 13.2% |
| 3 | 3923 | 12.2% |
| 4 | 3629 | 11.2% |
| 5 | 3319 | 10.3% |
| 6 | 2891 | 9.0% |
| 7 | 2209 | 6.8% |
| 0 | 1935 | 6.0% |
| 8 | 1817 | 5.6% |
| 9 | 1442 | 4.5% |
| Other values (21) | 2368 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 32284 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 32284 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 32284 |
civil_status
Categorical
MISSING 
| Distinct | 6 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 14370 |
| Missing (%) | 57.3% |
| Memory size | 196.0 KiB |
Length
| Max length | 12 |
|---|---|
| Median length | 11 |
| Mean length | 7.8179185 |
| Min length | 4 |
Characters and Unicode
| Total characters | 83683 |
|---|---|
| Distinct characters | 18 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Garçon |
|---|---|
| 2nd row | Fille |
| 3rd row | Fille |
| 4th row | Femme mariée |
| 5th row | Femme mariée |
Common Values
| Value | Count | Frequency (%) |
| Garçon | 2824 | 11.3% |
| Fille | 2823 | 11.3% |
| Homme marié | 2140 | 8.5% |
| Femme mariée | 2113 | 8.4% |
| Veuve | 512 | 2.0% |
| Veuf | 292 | 1.2% |
| (Missing) | 14370 |
Words
| Value | Count | Frequency (%) |
| garçon | 2824 | 18.9% |
| fille | 2823 | 18.9% |
| homme | 2140 | 14.3% |
| marié | 2140 | 14.3% |
| femme | 2113 | 14.1% |
| mariée | 2113 | 14.1% |
| veuve | 512 | 3.4% |
| veuf | 292 | 2.0% |
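The counts in the table above look like lowercased, whitespace-separated tokens: "Homme marié" contributes 2140 to both homme and marié. The sketch below reproduces that kind of count with `collections.Counter`. The tokenization rule is inferred from the numbers and is an assumption, and the function name is hypothetical.

```python
from collections import Counter


def word_frequencies(values):
    """Count lowercased, whitespace-separated tokens across non-missing values."""
    counts = Counter()
    for value in values:
        if value is not None:  # skip missing entries
            counts.update(value.lower().split())
    return counts


# The five sample rows shown for civil_status.
sample = ["Garçon", "Fille", "Fille", "Femme mariée", "Femme mariée"]
counts = word_frequencies(sample)
total = sum(counts.values())

for word, count in counts.most_common():
    print(f"{word}: {count} ({100 * count / total:.1f}%)")
```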
Most occurring characters
| Value | Count | Frequency (%) |
| m | 12759 | 15.2% |
| e | 12618 | 15.1% |
| r | 7077 | 8.5% |
| a | 7077 | 8.5% |
| i | 7076 | 8.5% |
| l | 5646 | 6.7% |
| o | 4964 | 5.9% |
| F | 4936 | 5.9% |
| é | 4253 | 5.1% |
| (space) | 4253 | 5.1% |
| Other values (8) | 13024 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 83683 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 83683 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 83683 |
nationality
Text
MISSING 
| Distinct | 73 |
|---|---|
| Distinct (%) | 0.5% |
| Missing | 11760 |
| Missing (%) | 46.9% |
| Memory size | 196.0 KiB |
Length
| Max length | 31 |
|---|---|
| Median length | 9 |
| Mean length | 7.2859396 |
| Min length | 1 |
Characters and Unicode
| Total characters | 97005 |
|---|---|
| Distinct characters | 48 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 40 |
|---|---|
| Unique (%) | 0.3% |
Sample
| 1st row | française |
|---|---|
| 2nd row | française |
| 3rd row | française |
| 4th row | française |
| 5th row | française |
| Value | Count | Frequency (%) |
| française | 8017 | 60.1% |
| idem | 4454 | 33.4% |
| français | 377 | 2.8% |
| francaise | 277 | 2.1% |
| polonaise | 53 | 0.4% |
| id | 23 | 0.2% |
| espagnole | 15 | 0.1% |
| belge | 14 | 0.1% |
| polonais | 9 | 0.1% |
| portugaise | 8 | 0.1% |
| Other values (58) | 90 | 0.7% |
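The table lists both "française" and "francaise", which differ only by the cedilla. The sketch below shows one way to fold such spelling variants together by stripping combining marks with the standard `unicodedata` module. It deliberately does not merge grammatical variants such as "français" and "française"; whether those should be merged is a separate decision. The function name is hypothetical.

```python
import unicodedata


def fold_accents(text):
    """Lowercase `text`, trim it, and drop accents and cedillas."""
    # NFD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    base_only = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base_only.lower().strip()


for raw in ["française", "francaise", "Français", "polonaise"]:
    print(raw, "->", fold_accents(raw))
```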
Most occurring characters
| Value | Count | Frequency (%) |
| a | 17495 | 18.0% |
| i | 13268 | 13.7% |
| e | 12946 | 13.3% |
| s | 8802 | 9.1% |
| n | 8797 | 9.1% |
| r | 8718 | 9.0% |
| f | 8576 | 8.8% |
| ç | 8395 | 8.7% |
| d | 4493 | 4.6% |
| m | 4474 | 4.6% |
| Other values (38) | 1041 | 1.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 97005 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 97005 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 97005 |
surname_household
Text
MISSING
| Distinct | 4126 |
|---|---|
| Distinct (%) | 73.2% |
| Missing | 19434 |
| Missing (%) | 77.5% |
| Memory size | 196.0 KiB |
Length
| Max length | 52 |
|---|---|
| Median length | 32 |
| Mean length | 7.3673759 |
| Min length | 3 |
Characters and Unicode
| Total characters | 41552 |
|---|---|
| Distinct characters | 66 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 3414 |
|---|---|
| Unique (%) | 60.5% |
Sample
| 1st row | Ferazzi |
|---|---|
| 2nd row | Machol |
| 3rd row | Desbois |
| 4th row | Desbroper |
| 5th row | Allemant |
| Value | Count | Frequency (%) |
| vve | 60 | 1.0% |
| ve | 55 | 0.9% |
| veuve | 55 | 0.9% |
| le | 43 | 0.7% |
| martin | 34 | 0.6% |
| de | 26 | 0.4% |
| née | 23 | 0.4% |
| thomas | 18 | 0.3% |
| faure | 16 | 0.3% |
| roux | 16 | 0.3% |
| Other values (4175) | 5830 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 4814 | 11.6% |
| a | 3864 | 9.3% |
| r | 3539 | 8.5% |
| u | 2812 | 6.8% |
| o | 2651 | 6.4% |
| i | 2640 | 6.4% |
| n | 2559 | 6.2% |
| l | 2081 | 5.0% |
| t | 1843 | 4.4% |
| s | 1369 | 3.3% |
| Other values (56) | 13380 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 41552 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 41552 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 41552 |
link
Text
MISSING 
| Distinct | 937 |
|---|---|
| Distinct (%) | 4.5% |
| Missing | 4339 |
| Missing (%) | 17.3% |
| Memory size | 196.0 KiB |
Length
| Max length | 48 |
|---|---|
| Median length | 42 |
| Mean length | 7.2057391 |
| Min length | 1 |
Characters and Unicode
| Total characters | 149411 |
|---|---|
| Distinct characters | 73 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 573 |
|---|---|
| Unique (%) | 2.8% |
Sample
| 1st row | sa fe |
|---|---|
| 2nd row | sa fe |
| 3rd row | le fils |
| 4th row | le fils |
| 5th row | le fils |
| Value | Count | Frequency (%) |
| chef | 4792 | 14.7% |
| fils | 3314 | 10.2% |
| femme | 3260 | 10.0% |
| fille | 3175 | 9.8% |
| sa | 2810 | 8.6% |
| leur | 2264 | 7.0% |
| idem | 2178 | 6.7% |
| de | 1718 | 5.3% |
| ménage | 1299 | 4.0% |
| épouse | 936 | 2.9% |
| Other values (502) | 6770 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 28574 | 19.1% |
| f | 15541 | 10.4% |
| l | 12872 | 8.6% |
| (space) | 11781 | 7.9% |
| m | 11660 | 7.8% |
| i | 10302 | 6.9% |
| s | 9058 | 6.1% |
| a | 5624 | 3.8% |
| d | 5588 | 3.7% |
| c | 5351 | 3.6% |
| Other values (63) | 33060 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 149411 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 149411 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 149411 |
birth_date
Text
MISSING 
| Distinct | 158 |
|---|---|
| Distinct (%) | 2.2% |
| Missing | 17730 |
| Missing (%) | 70.7% |
| Memory size | 196.0 KiB |
Length
| Max length | 9 |
|---|---|
| Median length | 4 |
| Mean length | 3.9978214 |
| Min length | 1 |
Characters and Unicode
| Total characters | 29360 |
|---|---|
| Distinct characters | 25 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 40 |
|---|---|
| Unique (%) | 0.5% |
Sample
| 1st row | 1905 |
|---|---|
| 2nd row | 1908 |
| 3rd row | 1878 |
| 4th row | 1906 |
| 5th row | 1908 |
| Value | Count | Frequency (%) |
| 1901 | 138 | 1.9% |
| 1905 | 133 | 1.8% |
| 1902 | 126 | 1.7% |
| 1903 | 124 | 1.7% |
| 1891 | 121 | 1.6% |
| 1904 | 121 | 1.6% |
| 1890 | 119 | 1.6% |
| 1897 | 118 | 1.6% |
| 1907 | 118 | 1.6% |
| 1896 | 117 | 1.6% |
| Other values (150) | 6112 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 8674 | 29.5% |
| 8 | 6325 | 21.5% |
| 9 | 4476 | 15.2% |
| 0 | 2051 | 7.0% |
| 7 | 1605 | 5.5% |
| 6 | 1482 | 5.0% |
| 2 | 1455 | 5.0% |
| 5 | 1217 | 4.1% |
| 4 | 1043 | 3.6% |
| 3 | 1005 | 3.4% |
| Other values (15) | 27 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 29360 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 29360 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 29360 |
lob
Text
MISSING 
| Distinct | 2923 |
|---|---|
| Distinct (%) | 31.7% |
| Missing | 15839 |
| Missing (%) | 63.2% |
| Memory size | 196.0 KiB |
Length
| Max length | 37 |
|---|---|
| Median length | 33 |
| Mean length | 7.7566865 |
| Min length | 1 |
Characters and Unicode
| Total characters | 71633 |
|---|---|
| Distinct characters | 83 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 2293 |
|---|---|
| Unique (%) | 24.8% |
Sample
| 1st row | St Eloy de Gy - Cher |
|---|---|
| 2nd row | idem |
| 3rd row | Chateauroux |
| 4th row | Orléans |
| 5th row | idem |
| Value | Count | Frequency (%) |
| idem | 3390 | 25.9% |
| st | 581 | 4.4% |
| orléans | 295 | 2.3% |
| | 289 | 2.2% |
| la | 254 | 1.9% |
| loiret | 170 | 1.3% |
| de | 146 | 1.1% |
| le | 139 | 1.1% |
| et | 123 | 0.9% |
| paris | 111 | 0.8% |
| Other values (2860) | 7603 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 9874 | 13.8% |
| i | 6390 | 8.9% |
| m | 4464 | 6.2% |
| d | 4408 | 6.2% |
| r | 4063 | 5.7% |
| a | 4038 | 5.6% |
| n | 3906 | 5.5% |
| (space) | 3866 | 5.4% |
| l | 3460 | 4.8% |
| o | 3285 | 4.6% |
| Other values (73) | 23879 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 71633 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 71633 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 71633 |
employer
Text
MISSING 
| Distinct | 1087 |
|---|---|
| Distinct (%) | 37.3% |
| Missing | 22163 |
| Missing (%) | 88.4% |
| Memory size | 196.0 KiB |
Length
| Max length | 52 |
|---|---|
| Median length | 49 |
| Mean length | 7.2998969 |
| Min length | 1 |
Characters and Unicode
| Total characters | 21250 |
|---|---|
| Distinct characters | 78 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 902 |
|---|---|
| Unique (%) | 31.0% |
Sample
| 1st row | Dupuis |
|---|---|
| 2nd row | patron |
| 3rd row | Bourgeois |
| 4th row | divén |
| 5th row | Usine d'ambere |
| Value | Count | Frequency (%) |
| patron | 659 | 17.2% |
| idem | 607 | 15.8% |
| divers | 106 | 2.8% |
| de | 95 | 2.5% |
| patronne | 84 | 2.2% |
| p | 46 | 1.2% |
| m | 46 | 1.2% |
| cie | 31 | 0.8% |
| et | 29 | 0.8% |
| po | 27 | 0.7% |
| Other values (1200) | 2105 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2364 | 11.1% |
| r | 1880 | 8.8% |
| a | 1770 | 8.3% |
| n | 1592 | 7.5% |
| i | 1476 | 6.9% |
| t | 1454 | 6.8% |
| o | 1426 | 6.7% |
| d | 1031 | 4.9% |
| (space) | 924 | 4.3% |
| p | 895 | 4.2% |
| Other values (68) | 6438 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 21250 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 21250 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 21250 |
observation
Text
MISSING 
| Distinct | 310 |
|---|---|
| Distinct (%) | 51.5% |
| Missing | 24472 |
| Missing (%) | 97.6% |
| Memory size | 196.0 KiB |
Length
| Max length | 137 |
|---|---|
| Median length | 44 |
| Mean length | 9.8272425 |
| Min length | 1 |
Characters and Unicode
| Total characters | 5916 |
|---|---|
| Distinct characters | 74 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 260 |
|---|---|
| Unique (%) | 43.2% |
Sample
| 1st row | m |
|---|---|
| 2nd row | m |
| 3rd row | v |
| 4th row | m |
| 5th row | m |
| Value | Count | Frequency (%) |
| veuve | 97 | 8.6% |
| idem | 90 | 8.0% |
| et | 24 | 2.1% |
| de | 24 | 2.1% |
| marié | 23 | 2.0% |
| femme | 19 | 1.7% |
| | 19 | 1.7% |
| du | 19 | 1.7% |
| sait | 18 | 1.6% |
| lire | 18 | 1.6% |
| Other values (385) | 779 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 898 | 15.2% |
| (space) | 528 | 8.9% |
| i | 464 | 7.8% |
| a | 343 | 5.8% |
| r | 340 | 5.7% |
| u | 317 | 5.4% |
| n | 309 | 5.2% |
| v | 252 | 4.3% |
| m | 246 | 4.2% |
| d | 244 | 4.1% |
| Other values (64) | 1975 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 5916 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 5916 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 5916 |
First rows
| | surname | firstname | occupation | age | civil_status | nationality | surname_household | link | birth_date | lob | employer | observation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Breton | Cyrille | menuisier | 25 | Garçon | française | NaN | NaN | NaN | NaN | NaN | NaN |
| 1 | Vignat | Zélie | prop re | 30 | NaN | française | NaN | sa fe | NaN | NaN | NaN | NaN |
| 2 | Houy | Caroline | domestique | 24 | Fille | française | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | Violet | Esther | fe de chambre | 24 | Fille | française | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | Apelmeau | Thérèse | domestique | 49 | Femme mariée | française | NaN | NaN | NaN | NaN | NaN | NaN |
| 5 | de Chaumont | Mathilde | profess | 30 | Femme mariée | française | NaN | sa fe | NaN | NaN | NaN | NaN |
| 6 | de Chaumont | Georges | NaN | 11 | Garçon | française | NaN | le fils | NaN | NaN | NaN | NaN |
| 7 | de Chaumont | Henro | NaN | 8 | Garçon | française | NaN | le fils | NaN | NaN | NaN | NaN |
| 8 | de Chaumont | Gaston | NaN | 5 | Garçon | française | NaN | le fils | NaN | NaN | NaN | NaN |
| 9 | Voisin | Anne | domestique | 24 | Fille | française | NaN | NaN | NaN | NaN | NaN | NaN |
Last rows
| | surname | firstname | occupation | age | civil_status | nationality | surname_household | link | birth_date | lob | employer | observation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 25064 | NaN | NaN | NaN | NaN | NaN | francaise | NaN | NaN | 1867 | NaN | NaN | NaN |
| 25065 | NaN | NaN | NaN | NaN | NaN | francaise | NaN | NaN | 1873 | NaN | NaN | NaN |
| 25066 | NaN | NaN | NaN | NaN | NaN | francaise | NaN | NaN | 1884 | NaN | NaN | NaN |
| 25067 | NaN | NaN | NaN | NaN | NaN | idem | NaN | chef | 1897 | Ay | NaN | NaN |
| 25068 | NaN | NaN | NaN | NaN | NaN | idem | NaN | chef | 1897 | NaN | patron | NaN |
| 25069 | NaN | NaN | NaN | NaN | NaN | NaN | Thierif | NaN | NaN | NaN | NaN | NaN |
| 25070 | NaN | NaN | NaN | NaN | NaN | NaN | Painchaud | NaN | NaN | NaN | NaN | NaN |
| 25071 | NaN | NaN | NaN | NaN | NaN | NaN | Gaston ve née | NaN | NaN | NaN | NaN | NaN |
| 25072 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | frère | 1897 | NaN | patron | NaN |
| 25073 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 1914 | francaise | NaN | NaN |
Most frequently occurring
| | surname | firstname | occupation | age | civil_status | nationality | surname_household | link | birth_date | lob | employer | observation | # duplicates |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | Chambonet | Marie | NaN | NaN | Fille | NaN | NaN | fille | NaN | NaN | NaN | NaN | 3 |
| 17 | Jaffeux fille | Jeanne | NaN | NaN | Fille | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 21 | Mailhut | Marie | NaN | NaN | Fille | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 3 |
| 0 | Campot | Jeanne | idem | NaN | Fille | NaN | NaN | sa fille | NaN | NaN | NaN | NaN | 2 |
| 2 | Chazeaux | Jean | NaN | NaN | Garçon | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2 |
| 3 | Cherleuille | Marie | idem | NaN | Fille | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2 |
| 4 | Coquin | Pierre | NaN | NaN | Garçon | NaN | NaN | NaN | NaN | NaN | NaN | NaN | 2 |
| 5 | Corre | Hélène | NaN | NaN | Fille | NaN | NaN | fille | NaN | NaN | NaN | NaN | 2 |
| 6 | Corre | Louise | NaN | NaN | Femme mariée | NaN | NaN | sa femme | NaN | NaN | NaN | femme Foucaud | 2 |
| 7 | Dixneuf | Marie | NaN | 12 | Fille | NaN | NaN | NaN | NaN | idem | NaN | NaN | 2 |
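A table like "Most frequently occurring" can be approximated by counting identical rows. The sketch below groups rows with `collections.Counter` and keeps those that occur more than once. It is an illustration under stated assumptions, not the profiler's implementation: the rows are shortened to three fields for readability, and missing values are represented as `None` rather than `NaN`, because `NaN` does not compare equal to itself and would prevent rows from matching.

```python
from collections import Counter


def duplicate_groups(rows):
    """Return a mapping from each repeated row to the number of times it occurs."""
    counts = Counter(tuple(row) for row in rows)
    return {row: n for row, n in counts.items() if n > 1}


# Illustrative rows (surname, firstname, civil_status) using names from this report.
rows = [
    ("Chambonet", "Marie", "Fille"),
    ("Coquin", "Pierre", "Garçon"),
    ("Chambonet", "Marie", "Fille"),
    ("Breton", "Cyrille", "Garçon"),
    ("Coquin", "Pierre", "Garçon"),
    ("Chambonet", "Marie", "Fille"),
]

groups = duplicate_groups(rows)
for row, n in sorted(groups.items(), key=lambda item: -item[1]):
    print(row, n)
```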